Abstract

The least box number coverage problem for calculating the dimension of fractal networks
is NP-hard. Meanwhile, the time complexity of random ball coverage for calculating the
dimension is very low. In this paper we rigorously derive the upper bound of the relative
error of the random ball coverage algorithm. We also propose a twice-random ball coverage
algorithm for calculating the network dimension. For many real-world fractal networks,
when the network diameter is sufficiently large, the relative error upper bound of this
method tends to 0. From this point of view, given a proper acceptable error range, the
dimension calculation is not an NP-hard problem but a P problem.

1 Introduction

The structure of complex systems across various disciplines can be abstracted and
conceptualized as complex networks of nodes and links, to which many quantitative methods
can be applied to extract the characteristics embedded in the system [1] [2] [3]. There exist
many types of networks, and characterizing their topology is very important for a wide range
of static and dynamic properties. Recently, C. Song et al. [5] applied a least box number
covering algorithm to demonstrate the existence of self-similarity in many real networks.
They also studied and compared several possible least box number covering algorithms by
applying them to a number of model and real-world networks [6]. They found that the least
box number covering optimization is equivalent to the well-known vertex coloring problem,
which is NP-hard. This implies that any practical least box number covering algorithm
is heuristic. Can we avoid the NP-hard problem and simplify the calculation of the network
fractal dimension? In this paper we modify the random sequential box-covering
algorithm [7] and present a random ball coverage algorithm. The smallest upper bound
(supremum) of its relative error is given theoretically. Simulation experiments show that,
no matter how large the network diameter is, this upper bound tends to about 0.36. We
therefore develop another algorithm for calculating the dimension, which employs only two
random ball coverages. We also derive the smallest relative error upper bound of this
algorithm, which tends to 0 when the network diameter is large enough. From this point of
view, for a proper acceptable error range, the random ball covering algorithm is equivalent
to the least box number covering algorithm in a statistical sense when the network is large
enough. There is no need to focus on the least box number covering optimization problem
when we want to calculate the dimension of a large-diameter network.

2 Strict upper bounds of random ball coverage 

The least box number coverage [6] and the random ball coverage are defined as follows. For
a given network G, a box with diameter r is a set of nodes in which the distance d_ij between
any two nodes i and j is smaller than r. The least box number coverage is the box coverage
with the minimum number of boxes required to cover the entire network G. To be consistent
with the definition of a fractal network [6], we use 'open' balls to cover the network, so our
random ball coverage differs slightly from the random sequential box-covering algorithm [7].
A ball with radius r and center node c is the set of nodes whose shortest path length to the
center c is smaller than r. The random ball coverage with radius r is defined as follows: at
each step, we randomly choose a node that has not yet been covered as a center, and cover
all the nodes whose distance to the center is smaller than r. The process is repeated until
all the nodes in the network are covered.
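
The definition above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the authors' implementation: the network is assumed to be an unweighted graph stored as an adjacency dictionary with sortable node labels, r is assumed to be an integer with r >= 1, and the names `ball` and `random_ball_coverage` are hypothetical.

```python
import random
from collections import deque


def ball(adj, center, r):
    """Return the open ball: nodes whose hop distance from `center` is < r."""
    dist = {center: 0}
    queue = deque([center])
    while queue:
        u = queue.popleft()
        if dist[u] + 1 >= r:  # neighbours of u would be at distance >= r
            continue
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)


def random_ball_coverage(adj, r, rng=None):
    """One random run; returns a list of (center, newly covered nodes)."""
    rng = rng or random.Random()
    uncovered = set(adj)
    balls = []
    while uncovered:
        center = rng.choice(sorted(uncovered))  # random uncovered node
        covered_now = ball(adj, center, r) & uncovered
        balls.append((center, covered_now))
        uncovered -= covered_now
    return balls


# Usage: a path graph 0-1-2-3-4-5 stored as an adjacency dict.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 5] for i in range(6)}
cover = random_ball_coverage(path, r=2, rng=random.Random(0))
print(len(cover))  # B(2) for this particular run
```

Each run costs one truncated breadth-first search per ball, which is why the ball number B(r) is cheap to obtain compared with an exact least box number.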

Theorem 1: $L(2r) \le B(r) \le L(r)$, where $L(r)$ is the number of boxes in a least box
number coverage with diameter $r$ and $B(r)$ denotes the number of balls in a random ball
coverage with radius $r$.

Proof:

∵ $B(r/2)$ can be regarded as the number of boxes in a random box coverage with
diameter $r - 1$ (any two nodes in an open ball with radius $r/2$ are at distance at most
$r - 2$ from each other),

∴ $L(r) \le L(r-1) \le B(r/2)$.

Suppose $L_1, L_2, \ldots, L_m$ denote all the boxes in a least box number coverage, where
$m = L(r)$, and $L_1 = \{n_{11}, n_{12}, \ldots, n_{1k_1}\}$, $L_2 = \{n_{21}, n_{22}, \ldots, n_{2k_2}\}$, $\ldots$,
$L_m = \{n_{m1}, n_{m2}, \ldots, n_{mk_m}\}$. Then we have $L_i \cap L_j = \Phi$ for all $i \ne j$, $i \le m$, $j \le m$,
where $\Phi$ denotes the empty set.

According to the above definition of random ball coverage with radius $r$, without loss of
generality suppose the center of the first ball in a random ball coverage process is $n_{11}$.
Every node in $L_1$ is at distance smaller than $r$ from $n_{11}$, so $L_1$ is covered by the first
ball and the center of the second random ball must lie outside $L_1$. Without loss of
generality we can also assume that $n_{21}$ is the center of the second ball; then $L_2$ is covered
by the second ball. Continuing in this way, each ball covers at least one box that was not
yet fully covered, so $B(r) \le L(r)$.

∴ $B(r) \le L(r) \le B(r/2)$

∴ $L(2r) \le B(r) \le L(r)$
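
Theorem 1 can be checked by brute force on a very small network. The sketch below is only an illustration under stated assumptions: the exact least box number is found by exhaustive search, which is feasible only for a handful of nodes, the test network is a 3x3 grid chosen arbitrarily, and the names `distances`, `least_box_number` and `random_ball_number` are hypothetical.

```python
import random
from collections import deque


def distances(adj):
    """All-pairs hop distances by BFS from every node (connected graph assumed)."""
    dist = {}
    for s in adj:
        d = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in d:
                    d[v] = d[u] + 1
                    queue.append(v)
        dist[s] = d
    return dist


def least_box_number(dist, r):
    """Exact L(r): fewest boxes whose pairwise distances are all < r."""
    nodes = sorted(dist)
    best = len(nodes)  # one box per node always works

    def place(i, boxes):
        nonlocal best
        if len(boxes) >= best:
            return  # cannot improve on the best cover found so far
        if i == len(nodes):
            best = len(boxes)
            return
        u = nodes[i]
        for box in boxes:  # try every compatible existing box
            if all(dist[u][v] < r for v in box):
                box.append(u)
                place(i + 1, boxes)
                box.pop()
        boxes.append([u])  # or open a new box for u
        place(i + 1, boxes)
        boxes.pop()

    place(0, [])
    return best


def random_ball_number(dist, r, rng):
    """B(r) for one run: centers are random uncovered nodes, balls are open."""
    uncovered = set(dist)
    count = 0
    while uncovered:
        c = rng.choice(sorted(uncovered))
        uncovered -= {v for v in dist[c] if dist[c][v] < r}
        count += 1
    return count


# A 3x3 grid graph as a small test network.
steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
grid = {(x, y): [(x + dx, y + dy) for dx, dy in steps
                 if 0 <= x + dx < 3 and 0 <= y + dy < 3]
        for x in range(3) for y in range(3)}
dist = distances(grid)
rng = random.Random(0)
for r in (1, 2, 3):
    lower, upper = least_box_number(dist, 2 * r), least_box_number(dist, r)
    runs = [random_ball_number(dist, r, rng) for _ in range(200)]
    print(r, lower, min(runs), max(runs), upper)
```

In every printed row the smallest and largest observed ball numbers should lie between the two least box numbers, as the theorem states.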

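Theorem 2: Suppose G is a fractal network, the available box diameter range is
$\{m, m+1, \ldots, R\}$, and we employ linear least squares regression to get the dimension.
Then the smallest upper bound of the relative error of the dimension calculated by random
ball coverage is

\[
\varepsilon(R, m) = \frac{\log 2 \left[ k \log \frac{R!}{(m-1)!} - (R-m+1) \log \frac{(m+k-1)!}{(m-1)!} \right]}{(R-m+1) \sum_{i=m}^{R} (\log i)^2 - \left( \log \frac{R!}{(m-1)!} \right)^2}
\]

where $k \in \{1, 2, \ldots, R-m+1\}$ satisfies
$\log(m+k-1) < \frac{1}{R-m+1} \log \frac{R!}{(m-1)!} \le \log(m+k)$.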

Proof: 

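∵ network $G$ is fractal in the range $\{m, m+1, \ldots, R\}$,

∴ there must exist a proper $b$ such that, for any $r \in \{m, m+1, \ldots, R\}$,

\[ \log_a L(r) = -s \log_a r + b \qquad (1) \]

\[ \log_a L(2r) = -s \log_a r + b - s \log_a 2 \qquad (2) \]

Then, by Theorem 1,

\[ \log_a B(r) = -s \log_a r + b - \theta_r s \log_a 2, \quad \theta_r \in [0, 1] \qquad (3) \]

where $s$ is the dimension of the network.

Suppose

\[ \log_a B(r) = -s \log_a r + b + \delta_r, \qquad (4) \]

\[ A = \begin{pmatrix} \log_a m & \log_a (m+1) & \cdots & \log_a R \\ 1 & 1 & \cdots & 1 \end{pmatrix}, \]

\[ \beta = \left( \log_a B(m), \ \log_a B(m+1), \ \ldots, \ \log_a B(R) \right). \]

Then $(-\hat{s}, \hat{b})' = (AA')^{-1} A \beta'$ and, since $\delta_i = -\theta_i s \log_a 2$ by (3) and (4),

\[
\hat{s} = s + s \log_a 2 \cdot \frac{(R-m+1) \sum_{i=m}^{R} \theta_i \log_a i - \sum_{i=m}^{R} \theta_i \cdot \log_a \frac{R!}{(m-1)!}}{(R-m+1) \sum_{i=m}^{R} (\log_a i)^2 - \left( \log_a \frac{R!}{(m-1)!} \right)^2}.
\]

∴ the relative error is

\[
e = \frac{|\hat{s} - s|}{s} = \frac{\log 2 \left| (R-m+1) \sum_{i=m}^{R} \theta_i \log i - \sum_{i=m}^{R} \theta_i \cdot \log \frac{R!}{(m-1)!} \right|}{(R-m+1) \sum_{i=m}^{R} (\log i)^2 - \left( \log \frac{R!}{(m-1)!} \right)^2},
\]

where the base of the logarithm is omitted because the ratio does not depend on it.

Employing linear programming over $\theta_i \in [0, 1]$, the smallest upper bound of the relative
error is

\[
\varepsilon(R, m) = \frac{\log 2 \left[ k \log \frac{R!}{(m-1)!} - (R-m+1) \log \frac{(m+k-1)!}{(m-1)!} \right]}{(R-m+1) \sum_{i=m}^{R} (\log i)^2 - \left( \log \frac{R!}{(m-1)!} \right)^2}
\]

where $k \in \{1, 2, \ldots, R-m+1\}$ satisfies
$\log(m+k-1) < \frac{1}{R-m+1} \log \frac{R!}{(m-1)!} \le \log(m+k)$.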

Fig. 1 shows the relationship among $R$, $m$ and $\varepsilon$.
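
The closed-form bound of Theorem 2 can be evaluated numerically and compared with a direct maximisation over the extreme choices of the $\theta_r$. The following sketch assumes $R > m \ge 1$ and uses natural logarithms (the ratio does not depend on the base); the function names `error_bound` and `error_bound_brute_force` are hypothetical, and the brute-force search is only practical for short ranges.

```python
import math
from itertools import product


def error_bound(R, m):
    """Closed-form bound epsilon(R, m) of Theorem 2 (assumes R > m >= 1)."""
    n = R - m + 1
    logs = [math.log(i) for i in range(m, R + 1)]
    total = sum(logs)  # equals log(R! / (m-1)!)
    k = sum(1 for x in logs if x < total / n)  # radii below the mean log
    numerator = math.log(2) * (k * total - n * sum(logs[:k]))
    denominator = n * sum(x * x for x in logs) - total ** 2
    return numerator / denominator


def error_bound_brute_force(R, m):
    """Maximise the relative error over all 0/1 choices of theta_r."""
    n = R - m + 1
    logs = [math.log(i) for i in range(m, R + 1)]
    total = sum(logs)
    denominator = n * sum(x * x for x in logs) - total ** 2
    best = 0.0
    for theta in product((0, 1), repeat=n):
        value = sum(t * (n * x - total) for t, x in zip(theta, logs))
        best = max(best, abs(value))
    return math.log(2) * best / denominator


for R, m in [(5, 1), (10, 1), (10, 2)]:
    print(R, m, error_bound(R, m), error_bound_brute_force(R, m))
```

Because the relative error is linear in the $\theta_r$, its maximum over $[0, 1]^{R-m+1}$ is attained at a 0/1 vertex, which is why enumerating the vertices is enough for the comparison.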

It is not easy to draw any conclusion about the upper bound directly from the above
expression, so we have done some numerical calculations. It seems that, for a given $m$, the
bound has a limit when $R$ becomes large. For instance, when $m = 1$ or $2$, its limit is about
0.36. This implies that, no matter how large the network diameter is, the relative error
upper bound never falls below about 0.36. In fact, we could use only the two points $B(m)$
and $B(R)$ to estimate the dimension $s$; we call this the twice-random ball coverage
algorithm. The corresponding upper bound of the relative error tends to 0 when the
diameter of the fractal network tends to infinity.
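
Theorem 3: Suppose G is a fractal network and the available box diameter range is
$\{m, m+1, \ldots, R\}$. We use only $B(m)$ and $B(R)$ to estimate the network dimension. Then
the estimated dimension value is

\[ \hat{s} = \frac{\log_a B(m) - \log_a B(R)}{\log_a R - \log_a m} \]

and the smallest relative error upper bound is $\log_{R/m} 2$.

Proof:

By Theorem 1, $L(2m) \le B(m) \le L(m)$ and $L(2R) \le B(R) \le L(R)$, so

\[ \max \hat{s} = s_1 = \frac{\log_a L(m) - \log_a L(2R)}{\log_a R - \log_a m} = s \log_{R/m} \frac{2R}{m}, \]

\[ \min \hat{s} = s_2 = \frac{\log_a L(2m) - \log_a L(R)}{\log_a R - \log_a m} = s \log_{R/m} \frac{R}{2m}, \]

and

\[ \varepsilon_2(R, m) = \left| \frac{\hat{s} - s}{s} \right| \le \max\left\{ \left| \frac{s_1 - s}{s} \right|, \left| \frac{s_2 - s}{s} \right| \right\} = \log_{R/m} 2 \qquad (5) \]

More details are shown in Fig. 1.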
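
Both estimators discussed so far, the least squares fit over all radii and the two-point (twice-random) estimate, can be sketched as follows. The input format, a dictionary mapping each radius r to an observed B(r), and the names `dimension_least_squares` and `dimension_two_point` are assumptions made for illustration; the data in the usage example are synthetic, not measurements from any real network.

```python
import math


def dimension_least_squares(ball_counts):
    """Fit log B(r) = -s log r + b over all radii and return the estimate of s."""
    xs = [math.log(r) for r in ball_counts]
    ys = [math.log(b) for b in ball_counts.values()]
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    slope = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    return -slope


def dimension_two_point(ball_counts):
    """Two-point estimate from B(m) and B(R), the smallest and largest radii."""
    m, R = min(ball_counts), max(ball_counts)
    s_hat = (math.log(ball_counts[m]) - math.log(ball_counts[R])) / (
        math.log(R) - math.log(m))
    bound = math.log(2) / math.log(R / m)  # log_{R/m} 2, relative error bound
    return s_hat, bound


# Synthetic data: an exact power law B(r) = 5000 * r**(-2.5) for r = 2..20.
counts = {r: 5000 * r ** -2.5 for r in range(2, 21)}
print(dimension_least_squares(counts))
print(dimension_two_point(counts))
```

On an exact power law both estimators recover the exponent; they differ only in how deviations of B(r) from the power law propagate into the estimate.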

Employing the random ball coverage algorithm (twice-random ball coverage algorithm), we
find that the fractal dimension of the world-wide web is s = 4.16 (3.61) (R = diameter,
m = 1), which is consistent with the dimension obtained by C. Song et al. (s = 4.1) [1].
From our empirical results, we find that for many networks R = diameter and m = 2 is more
reasonable. With m = 2 the dimension of the WWW network is 4.48 (4.51). When the
available box diameter R is not sufficiently large, we can calculate B(r) many times to get
the dimension.

We also tested this method on the 43 cellular networks [8]. For each network
(R = diameter, m = 2) and each B(r), we perform the random coverage 100 times. The
average dimension over all the cellular networks is s = 3.54 (3.58), which is in very good
agreement with the dimensions obtained by C. Song et al. (s = 3.5) [4] and
W. Zhou et al. (s = 3.54 ± 0.27) [9]. Because we calculate each B(r) 100 times, we obtain
100 different dimensions for each of the 43 cellular networks, and hence a variance for each
network. The maximum of these variances over the 43 cellular networks is 0.042 (0.062),
and their average is 0.018 (0.023). If we use the network dimension obtained from the 100
calculations as a substitute for the real fractal dimension s in the above discussion, the
maximum relative error over the 100 calculations and the 43 cellular networks is
0.15 (0.20). The average maximum relative error is 0.045 (0.11) and the average relative
error is 0.030 (0.034).

The empirical relative errors are far smaller than the corresponding theoretical upper
bounds. Interestingly, in our theoretical discussion the upper bound of the twice-random
ball coverage algorithm is smaller than that of the random ball coverage algorithm, but the
empirical results always show that the random ball coverage algorithm performs better.
We therefore think the random ball coverage algorithm is better than the twice-random
ball coverage algorithm in practice. Moreover, our theorems can also be used to estimate a
network's diameter.



3 Conclusion and discussion 

In this paper, we rigorously derive the upper bound of the relative error of the random ball
coverage method in fractal network dimension calculation. We also obtain a simple
relative error upper bound, $\log_{R/m} 2$, for the twice-random ball coverage method. For many
real-world networks, when the network diameter is sufficiently large this relative error
upper bound tends to 0. Therefore, if the network diameter is sufficiently large,
twice-random ball coverage is equivalent to the least box number coverage in fractal
dimension calculation, and calculating the fractal network dimension is not an NP-hard
problem. For networks whose diameter is not sufficiently large, we can calculate the random
ball number many times to get the dimension, which is also very effective and accurate.

Figure 1: Relative error plot. R denotes the diameter of a network; $\varepsilon$ and $\varepsilon_2$ denote the
relative errors of the random ball coverage algorithm and the twice-random ball coverage
algorithm.

The above discussion leads naturally to another problem. We can also define a random
full box coverage. A full box with diameter r is a box such that adding any node from
outside the box would make the box diameter larger than or equal to r. The random full
box coverage algorithm with diameter r can be defined as follows [6]: at each step we
randomly choose an uncovered node p as the first node of the box, and add uncovered nodes
to the box until the box becomes full. We conjecture that the random full box coverage
algorithm is equivalent to the least box number covering algorithm in a statistical sense.
In the future we will investigate this problem in more depth.
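
The random full box coverage described above can be sketched as follows. This is only one possible reading of the definition: candidate nodes are assumed to be tried in random order, a box is filled using uncovered nodes only, the graph is assumed unweighted with sortable node labels, and the names `hop_distances` and `random_full_box_coverage` are hypothetical.

```python
import random
from collections import deque


def hop_distances(adj, source):
    """BFS hop distances from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def random_full_box_coverage(adj, r, rng=None):
    """One run of random full box coverage; returns the list of boxes."""
    rng = rng or random.Random()
    dist = {u: hop_distances(adj, u) for u in adj}
    uncovered = set(adj)
    boxes = []
    while uncovered:
        seed = rng.choice(sorted(uncovered))  # first node p of the box
        box = [seed]
        candidates = sorted(uncovered - {seed})
        rng.shuffle(candidates)  # assumption: candidates tried in random order
        for u in candidates:
            # keep u only if every pairwise distance in the box stays below r
            if all(v in dist[u] and dist[u][v] < r for v in box):
                box.append(u)
        boxes.append(box)
        uncovered -= set(box)
    return boxes


# Usage: a ring of 8 nodes.
ring = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
print(len(random_full_box_coverage(ring, 3, random.Random(0))))
```

A single pass over the candidates is enough to make the box full with respect to the uncovered nodes: a node rejected once stays incompatible, because adding further nodes to the box can only add distance constraints.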